Machine learning in automated text categorization
Fabrizio Sebastiani

March 2002 ACM Computing Surveys (CSUR), Volume 34 Issue 1
Publisher: ACM Press

Full text available: pdf (524.41 KB) Additional Information: full citation, abstract, references, citings, index terms



The automated categorization (or classification) of texts into predefined categories has 
witnessed a booming interest in the last 10 years, due to the increased availability of 
documents in digital form and the ensuing need to organize them. In the research 
community the dominant approach to this problem is based on machine learning 
techniques: a general inductive process automatically builds a classifier by learning, from 
a set of preclassified documents, the characteristics of the categories. ...

2 Building task-specific interfaces to high volume conversational data
Loren G. Terveen, William C. Hill, Brian Amento, David McDonald, Josh Creter
March 1997 Proceedings of the SIGCHI conference on Human factors in computing systems
Publisher: ACM Press

Full text available: pdf (908.00 KB) Additional Information: full citation, references, citings, index terms



Keywords: Netnews, Usenet, World Wide Web, collaborative filtering,
computer-supported cooperative work, data mining, human interface, human-computer interaction,
organizational computing, resource discovery, social filtering



WSQ/DSQ: a practical approach for combined querying of databases and the Web
Roy Goldman, Jennifer Widom

May 2000 ACM SIGMOD Record, Proceedings of the 2000 ACM SIGMOD international
conference on Management of data SIGMOD '00, Volume 29 Issue 2
Publisher: ACM Press

Full text available: pdf (223.65 KB) Additional Information: full citation, abstract, references, citings, index terms

We present WSQ/DSQ (pronounced "wisk-disk"), a new approach for combining the query 
facilities of traditional databases with existing search engines on the Web. WSQ, for
Web-Supported (Database) Queries, leverages results from Web searches to enhance SQL
queries over a relational database. DSQ, for Database-Supported (Web) Queries, uses 
information stored in the database to enhance and explain Web searches. This paper 
focuses primarily on WSQ, describing a simple, lo ... 



Fast detection of communication patterns in distributed executions 
Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced 
Studies on Collaborative research 

Publisher: IBM Press 

Full text available: pdf (4.21 MB) Additional Information: full citation, abstract, references, index terms

Understanding distributed applications is a tedious and difficult task. Visualizations based 
on process-time diagrams are often used to obtain a better understanding of the 
execution of the application. The visualization tool we use is Poet, an event tracer 
developed at the University of Waterloo. However, these diagrams are often very complex 
and do not provide the user with the desired overview of the application. In our 
experience, such tools display repeated occurrences of non-trivial commun ... 

A method of geographical name extraction from Japanese text for thematic
geographical search
Yasusi Kanada 

November 1999 Proceedings of the eighth international conference on Information 
and knowledge management 

Publisher: ACM Press 

Full text available: pdf (1.28 MB) Additional Information: full citation, abstract, references, index terms

A text retrieval method called the thematic geographical search method has been 
developed and applied to a Japanese encyclopedia called the World Encyclopaedia. In this 
method, the user specifies a search theme using free words, then obtains a sorted list of 
excerpts and hyperlinks to encyclopedia sentences that contain geographical names. 
Using this list, the user can also open maps that indicate the locations of the names. To 
generate an index of names for this searching, a method of ... 

Web-based personalization and management of interactive video
Rune Hjelsvold, Subu Vdaygiri, Yves Leaute 

April 2001 Proceedings of the 10th international conference on World Wide Web 
Publisher: ACM Press 

Full text available: pdf (611.20 KB) Additional Information: full citation, references, citings, index terms

Model-driven development of Web applications: the AutoWeb system
Piero Fraternali, Paolo Paolini

October 2000 ACM Transactions on Information Systems (TOIS), Volume 18 Issue 4
Publisher: ACM Press

Full text available: pdf (6.94 MB) Additional Information: full citation, abstract, references, citings, index terms

This paper describes a methodology for the development of WWW applications and a tool 
environment specifically tailored for the methodology. The methodology and the 
development environment are based upon models and techniques already used in the 
hypermedia, information systems, and software engineering fields, adapted and blended 
in an original mix. The foundation of the proposal is the conceptual design of WWW 
applications, using HDM-lite, a notation for the specification of structure, nav ...

9 Impact of software engineering research on the practice of software configuration
management

Jacky Estublier, David Leblang, Andre van der Hoek, Reidar Conradi, Geoffrey Clemm, Walter
Tichy, Darcy Wiborg-Weber

October 2005 ACM Transactions on Software Engineering and Methodology (TOSEM),
Volume 14 Issue 4

Publisher: ACM Press

Full text available: pdf (350.59 KB) Additional Information: full citation, abstract, references, index terms

Software Configuration Management (SCM) is an important discipline in professional 
software development and maintenance. The importance of SCM has increased as 
programs have become larger, more long lasting, and more mission and life critical. This 
article discusses the evolution of SCM technology from the early days of software 
development to the present, with a particular emphasis on the impact that university and 
industrial research has had along the way. Based on an analysis of the publicati ... 

Keywords: Versioning, data model, process support, research impact, software 
configuration management, software engineering, workspace management 



10 Information retrieval on the web
Mei Kobayashi, Koichi Takeda

June 2000 ACM Computing Surveys (CSUR), Volume 32 Issue 2

Publisher: ACM Press

Full text available: pdf (213.89 KB) Additional Information: full citation, abstract, references, citings, index terms

In this paper we review studies of the growth of the Internet and technologies that are 
useful for information search and retrieval on the Web. We present data on the Internet 
from several different sources, e.g., current as well as projected number of users, hosts, 
and Web sites. Although numerical figures vary, overall trends cited by the sources are 
consistent and point to exponential growth in the past and in the coming decade. Hence it 
is not surprising that about 85% of Internet user ... 

Keywords: Internet, World Wide Web, clustering, indexing, information retrieval, 
knowledge management, search engine 



11 A multi-paradigm querying approach for a generic multimedia database management
system

Ji-Rong Wen, Qing Li, Wei-Ying Ma, Hong-Jiang Zhang
March 2003 ACM SIGMOD Record, Volume 32 Issue 1

Publisher: ACM Press

Full text available: pdf (524.08 KB) Additional Information: full citation, abstract, references, citings

To truly meet the requirements of multimedia database (MMDB) management, an 
integrated framework for modeling, managing and retrieving various kinds of media data 
in a uniform way is necessary. MediaLand is an experimental MMDB platform being 
developed at Microsoft Research Asia for users with different levels of experiences and 
expertise to manage and search multimedia repositories easily, efficiently, and 
cooperatively. Key features of MediaLand include a uniform data model for describi ... 

Keywords: media independence, multi-paradigm querying, multimedia database 
management, uniform data modeling 



12 Axis-specified search: a fine-grained full-text search method for gathering and
structuring excerpts
Yasusi Kanada

May 1998 Proceedings of the third ACM conference on Digital libraries

Publisher: ACM Press

Full text available: pdf (1.35 MB) Additional Information: full citation, references, citings, index terms



13 Enhanced hypertext categorization using hyperlinks
Soumen Chakrabarti, Byron Dom, Piotr Indyk

June 1998 ACM SIGMOD Record, Proceedings of the 1998 ACM SIGMOD international
conference on Management of data SIGMOD '98, Volume 27 Issue 2
Publisher: ACM Press

Full text available: pdf (1.91 MB) Additional Information: full citation, abstract, references, citings, index terms

A major challenge in indexing unstructured hypertext databases is to automatically 
extract meta-data that enables structured search using topic taxonomies, circumvents 
keyword ambiguity, and improves the quality of search and profile-based routing and 
filtering. Therefore, an accurate classifier is an essential component of a hypertext 
database. Hyperlinks pose new problems not addressed in the extensive text classification 
literature. Links clearly contain high-quality semantic clues that ... 

14 NSF workshop on industrial/academic cooperation in database systems 
Mike Carey, Len Seligman

March 1999 ACM SIGMOD Record, Volume 28 Issue 1
Publisher: ACM Press

Full text available: pdf (1.96 MB) Additional Information: full citation, index terms



15 Pen computing: a technology overview and a vision
Andre Meyer

July 1995 ACM SIGCHI Bulletin, Volume 27 Issue 3
Publisher: ACM Press

Full text available: pdf (5.14 MB) Additional Information: full citation, abstract, citings, index terms

This work gives an overview of a new technology that is attracting growing interest in 
public as well as in the computer industry itself. The visible difference from other 
technologies is in the use of a pen or pencil as the primary means of interaction between 
a user and a machine, picking up the familiar pen and paper interface metaphor. From 
this follows a set of consequences that will be analyzed and put into context with other 
emerging technologies and visions. Starting with a short historic ... 

16 Columns: Risks to the public in computers and related systems
Peter G. Neumann

January 2001 ACM SIGSOFT Software Engineering Notes, Volume 26 Issue 1
Publisher: ACM Press

Full text available: pdf (3.24 MB) Additional Information: full citation



17 A tour through Tapestry
Douglas B. Terry

December 1993 Proceedings of the conference on Organizational computing systems
Publisher: ACM Press

Full text available: pdf (989.92 KB) Additional Information: full citation, references, citings, index terms



Keywords: appraisers, collaborative filtering, electronic mail, highlighting, information

18 Clustering categorical data: an approach based on dynamical systems
David Gibson, Jon Kleinberg, Prabhakar Raghavan 



February 2000 The VLDB Journal — The International Journal on Very Large Data 

Bases, Volume 8 Issue 3-4 

Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf (213.68 KB) Additional Information: full citation, abstract, citings, index terms

We describe a novel approach for clustering collections of sets, and its application to the 
analysis and mining of categorical data. By "categorical data," we mean tables with fields
that cannot be naturally ordered by a metric - e.g., the names of producers of 
automobiles, or the names of products offered by a manufacturer. Our approach is based 
on an iterative method for assigning and propagating weights on the categorical values in 
a table; this facilitates a type of similar ... 

Keywords: Categorical data, Clustering, Data mining, Dynamical systems, Hypergraphs



19 Mining and disambiguating names: Finding authoritative people from the web

Masanori Harada, Shin-ya Sato, Kazuhiro Kazama
June 2004 Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries

Publisher: ACM Press

Full text available: pdf (260.06 KB) Additional Information: full citation, abstract, references, index terms

Today's web is so huge and diverse that it arguably reflects the real world. For this 
reason, searching the web is a promising approach to find things in the real world. This 
paper presents NEXAS, an extension to web search engines that attempts to find
real-world entities relevant to a topic. Its basic idea is to extract proper names from the web
pages retrieved for the topic. A main advantage of this approach is that users can query 
any topic and learn about relevant real-world entities without ... 

Keywords: proper name extraction, question answering, text analysis, web mining 



20 Concept mapping vs. web page hyperlinks as an information retrieval interface:
preferences of postgraduate culturally diverse learners
Melius Weideman, Wouter Kritzinger

September 2003 Proceedings of the 2003 annual research conference of the South
African institute of computer scientists and information technologists
on Enablement through technology SAICSIT '03

Publisher: South African Institute for Computer Scientists and Information Technologists

Full text available: pdf (328.41 KB) Additional Information: full citation, abstract, references, index terms

The principal objective of this research project was to determine if and to what extent 
cultural factors prescribe interface choices by learners. Concept mapping and standard 
hyperlinks were offered as choices for information retrieval interfaces. The methods 
employed were to identify a set of culturally divisive factors, and then to test two different 
interfaces with a group of culturally diverse, advanced learners. Some of the results had 
to be ignored due to small sample sizes. The remaining ... 

Keywords: concept mapping, cultural diversity, experimentation, higher education, 
human factors, hyperlinks, information retrieval, measurement 